These data have been run through:
1.) cutadapt to remove primers and adaptors
2.) dada2 for initial quality filtering, trimming, and chimera removal
3.) MACSE to remove sequences with nonsense amino acid alignments according to invertebrate protein translations
4.) QIIME2 to cluster OTUs at 97%, resulting in 8432 OTUs
5.) Taxonomic assignment was conducted in three passes:
a.) top hit at 95% match threshold using vsearch and the CoArbitrator Database with HK Voucher sequences appended (623 OTUs)
b.) top hit at 80% match threshold using vsearch and the CoArbitrator Database with HK Voucher sequences appended (2247 OTUs)
c.) blast+ search with LCA; any additional bacteria were also filtered here (1088 OTUs)
d.) remaining unassigned OTUs (4,447) were removed.
3985 of the 8432 OTUs had some taxonomic assignment and went into the following analyses.
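The pass-by-pass logic above can be sketched in base R. This is only an illustration under stated assumptions: the toy table, the column names (`best_identity`, `blast_lca`), and the helper `assign_pass()` are hypothetical and are not taken from the actual pipeline. It shows each OTU being claimed by the first pass it qualifies for.

```r
# Toy OTUs with hypothetical best-hit identities (percent) and BLAST+ LCA results.
otus <- data.frame(
  otu = paste0("OTU", 1:6),
  best_identity = c(98.2, 96.0, 88.5, 81.0, 70.3, NA), # NA = no vsearch hit
  blast_lca = c(NA, NA, NA, NA, "Arthropoda", NA),     # NA = no LCA assignment
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the first pass an OTU qualifies for.
assign_pass <- function(identity, lca) {
  if (!is.na(identity) && identity >= 95) return("pass_a_95")
  if (!is.na(identity) && identity >= 80) return("pass_b_80")
  if (!is.na(lca)) return("pass_c_blast_lca")
  "unassigned"
}

otus$pass <- mapply(assign_pass, otus$best_identity, otus$blast_lca)

# Drop OTUs that no pass could assign, mirroring the final removal step.
kept <- otus[otus$pass != "unassigned", ]
print(table(otus$pass))
```

Applying the passes in order means an OTU matched at 95% is never re-counted in the 80% pass.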
library(qiime2R) ## provides read_qza()
COI_data <- read_qza("5_Qiime/table-dn-97.qza")$data ## data table, table-dn-97.qza
COI_meta <- read.table("COI_Final_working/meta_slim.csv", header = TRUE, sep = ",") ## 5_Qiime/meta.csv
COI_taxa <- read.csv("COI_Final_working/T_build_f.csv", header = TRUE) ## already lots of cleaning at this point
The data and files were then reformatted for import into phyloseq (code not included).
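Since the reformatting code is not included, the following is only a guess at the kind of step involved, not the author's code. It aligns an OTU table, taxonomy table, and metadata on shared IDs, then builds a phyloseq object if the package is installed. All toy values and object names here are hypothetical.

```r
# Toy stand-ins for the OTU table, metadata, and taxonomy (values are hypothetical).
otu_mat <- matrix(c(5, 0, 2, 0, 3, 1), nrow = 3,
                  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("S1", "S2")))
meta <- data.frame(Station_ID = c("BI", "PC", "SW"),
                   row.names = c("S2", "S1", "S3"))
taxa <- data.frame(Phylum = c("Annelida", "Mollusca"),
                   row.names = c("OTU3", "OTU1"))

# Keep only OTUs and samples present in every table, in one consistent order.
shared_otus <- intersect(rownames(otu_mat), rownames(taxa))
shared_samples <- intersect(colnames(otu_mat), rownames(meta))
otu_mat <- otu_mat[shared_otus, shared_samples, drop = FALSE]
taxa <- taxa[shared_otus, , drop = FALSE]
meta <- meta[shared_samples, , drop = FALSE]

# Build the phyloseq object only when the package is available.
if (requireNamespace("phyloseq", quietly = TRUE)) {
  ps <- phyloseq::phyloseq(
    phyloseq::otu_table(otu_mat, taxa_are_rows = TRUE),
    phyloseq::tax_table(as.matrix(taxa)),
    phyloseq::sample_data(meta)
  )
}
```

Matching row and column names before construction avoids silently dropping or misaligning OTUs and samples.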
We focused on seven sites across Hong Kong. Three ARMS were deployed at each site for each of two periods: Jan2018-Jan2019 and July2018-July2019.
Based on environmental parameters measured monthly from Jan2018-July2019, the sites grouped roughly into “High” and “Low” impact, with Sham Wan in the middle.
| Station_ID | Impact | Latitude | Longitude | Collections | ARMS |
|---|---|---|---|---|---|
| BI | Low | 22.32429 | 114.3529 | Summer Winter 2019 | 19,20,21,40,41,42 |
| CDA | Low | 22.20781 | 114.2563 | Summer Winter 2019 | 22,23,24,43,44,45 |
| CI | High | 22.43969 | 114.2210 | 2017, Summer Winter 2019 | 7,8,9,13,14,15,34,35,36 |
| CLP | Med | 22.46363 | 114.2906 | 2017 | 4,5,6 |
| PC | High | 22.28916 | 114.0341 | Summer Winter 2019 | 31,32,33,52,53,54 |
| PI | Med | 22.50164 | 114.3564 | 2017 | 1,2,3 |
| SSW | High | 22.28182 | 113.8919 | Summer Winter 2019 | 28,29,30,49,50,51 |
| SW | Med | 22.18731 | 114.1354 | Summer Winter 2019 | 25,26-lost,27,46,47,48 |
| TPC | Low | 22.54397 | 114.4341 | 2017, Summer Winter 2019 | 10,11,12,16,17,18,37,38,39 |
**All figures are ordered from left (High Impact) to right (Low Impact) along the x-axis.**
The accumulation curve indicates that the number of taxa has not entirely leveled off yet.
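For readers unfamiliar with accumulation curves, here is a minimal base R sketch of the idea: add samples in random order, count the cumulative number of taxa seen, and average over permutations. The data are simulated and the function `accumulate_taxa()` is a hypothetical helper. It is not the method used to produce the figure.

```r
set.seed(1)
# Simulated presence/absence matrix: 8 samples (rows) x 30 taxa (columns).
pa <- matrix(rbinom(8 * 30, 1, 0.2), nrow = 8)

# Hypothetical helper: mean cumulative taxa count over random sample orderings.
accumulate_taxa <- function(pa, n_perm = 100) {
  n <- nrow(pa)
  curves <- replicate(n_perm, {
    ord <- sample(n)                                   # random sample order
    seen <- apply(pa[ord, , drop = FALSE], 2, cummax)  # taxon seen so far?
    rowSums(seen)                                      # taxa seen after each sample
  })
  rowMeans(curves)
}

curve <- accumulate_taxa(pa)
plot(seq_along(curve), curve, type = "b",
     xlab = "Number of samples", ylab = "Cumulative taxa")
```

A curve that is still rising at the last sample suggests that additional sampling would recover more taxa.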
## Warning in cor(x > 0): the standard deviation is zero
Sequence counts can be biased, but they are shown here just to have a look across various levels of clustering.
Dropping abundance data and focusing on presence and absence of OTUs
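Converting counts to presence/absence can be sketched in base R as follows. The count matrix and sample names are made up for illustration.

```r
# Hypothetical read counts: OTUs in rows, ARMS samples in columns.
counts <- matrix(c(120, 0, 3,
                   0, 0, 45,
                   7, 0, 1),
                 nrow = 3,
                 dimnames = list(paste0("OTU", 1:3),
                                 c("ARMS_A", "ARMS_B", "ARMS_C")))

# Any non-zero count becomes 1 (present); zero stays 0 (absent).
pa <- (counts > 0) * 1L

# OTU richness per sample is then a simple column sum.
richness <- colSums(pa)
print(richness)
```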
Seasonality had some effect, with Winter ARMS (i.e., Jan2018-Jan2019) being less diverse at some sites but not all.
Plots were used to look at beta diversity using presence/absence data for each taxon (abundance data made things too messy).
Sites grouped together but the amount of variation explained by the axes was relatively low. Points represent individual size fractions from each ARMS and though not clearly shown here, size fractions did not explain variation well.
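As an illustration of how such an ordination and its "variation explained" values can be obtained, here is a base R sketch using Jaccard distance and classical PCoA. The source does not state which distance or ordination method was used, so both choices are assumptions, and the data are toy values.

```r
# Toy presence/absence matrix: samples in rows, OTUs in columns.
pa <- matrix(c(1, 1, 0, 0, 1,
               1, 1, 1, 0, 0,
               0, 0, 1, 1, 1,
               0, 1, 1, 1, 0),
             nrow = 4, byrow = TRUE,
             dimnames = list(paste0("sample", 1:4), paste0("OTU", 1:5)))

# For 0/1 data, dist(method = "binary") is the Jaccard distance.
jac <- dist(pa, method = "binary")

# Classical PCoA (metric multidimensional scaling) on the distance matrix.
pcoa <- cmdscale(jac, k = 2, eig = TRUE)

# Percent of variation on the first two axes, using positive eigenvalues only.
pos_eig <- pcoa$eig[pcoa$eig > 0]
var_explained <- 100 * pos_eig[1:2] / sum(pos_eig)

plot(pcoa$points, xlab = "PCoA 1", ylab = "PCoA 2")
```

Low percentages on the first two axes mean that the 2D plot captures only part of the differences among samples.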
To check whether there is large phyletic turnover across the sites, uncommon phyla were dropped to improve plot resolution.
It appears that at least some OTUs within most phyla are spread across sites, so species turnover is mostly within phyla.
However, Bacillariophyta, Rhodophyta, and maybe Sipuncula seem to be weighted toward the low impact site at the middle left.
Another way to visualize how taxa/OTUs are distributed across Hong Kong is through network analyses. Below, each node represents an ARMS, and connections are made based on the proportion of shared taxa (presence/absence) when a threshold of shared taxa between ARMS is met.
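The edge-building step can be sketched in base R. Two assumptions are made here because the source does not specify them: the proportion of shared taxa is computed as shared taxa divided by taxa found in either ARMS (Jaccard similarity), and the threshold of 0.4 is arbitrary. The data are toy values.

```r
# Toy presence/absence matrix: ARMS in rows, OTUs in columns.
pa <- matrix(c(1, 1, 0, 0, 1,
               1, 1, 1, 0, 0,
               0, 0, 1, 1, 1,
               0, 1, 1, 1, 0),
             nrow = 4, byrow = TRUE,
             dimnames = list(paste0("ARMS", 1:4), paste0("OTU", 1:5)))

shared <- pa %*% t(pa)                                    # taxa present in both
either <- outer(rowSums(pa), rowSums(pa), "+") - shared   # taxa present in either
prop_shared <- shared / either                            # assumed: Jaccard similarity
diag(prop_shared) <- 0                                    # no self-connections

threshold <- 0.4                                          # hypothetical cutoff
keep <- prop_shared >= threshold & upper.tri(prop_shared) # each pair once
edges <- which(keep, arr.ind = TRUE)

edge_list <- data.frame(
  from = rownames(pa)[edges[, "row"]],
  to = rownames(pa)[edges[, "col"]],
  weight = prop_shared[edges]
)
print(edge_list)
```

The resulting edge list can be passed to a graph package for plotting; raising the threshold thins the network to only the most similar ARMS pairs.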
Trends in alpha diversity were not very clear, so we decided to focus on the most abundant phyla and look at phylogenetic diversity as well. All OTUs matched to Mollusca, Arthropoda, and Annelida were pulled.
Alignments and trees were then made in QIIME2:
1.) Alignment - mafft [default settings]
2.) Tree - iqtree [default settings]
3.) Root tree - midpoint-root
Annelida and Arthropoda show clear differences between high impact (SSW, PC, CI) and low impact (CDA, BI, TPC) sites. SW also grouped with the low impact sites.
Mollusk diversity does not show the same pattern.